Keywords: Coviewing; Attention; Comprehension

Coviewing is a commonly recommended practice, but little is known about how coviewing impacts children's
educational media viewing experience. We investigated how coviewing impacts attention and comprehension of
educational media, as well as the role of baseline vocabulary in understanding these associations. Eighty-three
preschoolers viewed two videos on an eye-tracker, one with an adult coviewer and the other without. Children's
baseline vocabulary, attention, and comprehension were assessed. Results indicated that coviewing benefited
visual attention. Neither coviewing condition nor attention, however, predicted children's comprehension. 
Instead, comprehension was predicted by age, vocabulary, and an interaction between coviewing condition, 
vocabulary, and attention. The interaction revealed that comprehension was stronger in the coviewing condition 
than the noninteractive condition only when children also had stronger visual attention to the program and 
larger vocabularies. Results suggest that coviewing benefits attention, but that both attention and child language 
are integrally tied to whether coviewing predicts comprehension. 

Introduction


Young children are avid consumers of media in today's society,
with children aged two to four watching over 2 h of television per day
(Rideout, 2017). Fortunately, media targeting preschoolers often have
educationally relevant goals, and preschool-aged children are skilled at
comprehending and learning from these educational media programs,
even when viewing media alone (Mares & Pan, 2013; Takacs, Swart, &
Bus, 2015). Nonetheless, recommendations regarding children's screen
media use, such as those by the American Academy of Pediatrics (2016),
suggest that parents should coview media with preschoolers to help
them better understand what they see. Coviewing may be beneficial in
many ways, including allowing parents to discuss and potentially
mitigate any harmful effects of exposure to violence or risk-taking
behaviors in media programming. However, in the context of educational
media (programs that have the explicit intent to teach children a
school-related skill rather than be primarily entertaining; Vandewater
& Bickham, 2004), it is less clear if coviewing enhances the viewing
and educational experience for young children. In our study, we focus
on the role of coviewing in enhancing the learning environment of
video programs viewed on screen-based educational media platforms,
such as television, streamed videos, iPads, and smart phones.

In fact, past research on the learning benefits of coviewing
educational media has been largely inconclusive: some studies have
found learning benefits to coviewing over viewing media alone, while
others have found no added learning benefit to coviewing (Reiser, 
Tessmer, & Phelps, 1984; Skouteris & Kelly, 2006; Strouse, O'Doherty, &
Troseth, 2013). Additionally, little work examines how coviewing 
might impact the processes involved with media consumption and 
comprehension, such as child attention. Attention is necessary though 
not sufficient for understanding the content displayed on screen (Smith, 
Colunga, & Yoshida, 2010), so increasing our understanding of how 
coviewing impacts visual attention might help us learn more about the 
proximal effects of coviewing on children's viewing experience. Much 
like the lack of process-level data, little work examines how coviewing 
interacts with child characteristics, such as baseline vocabulary size. In 
order to comprehend the narrative of an educational media program, 
children need not only attention to the media, but also the language 
necessary to understand the media narration along with the added 
coviewer speech. The present study therefore investigates how cov- 
iewing impacts attention and comprehension, as well as the role of 
child baseline vocabulary in understanding these associations. 

The overall goal of the present study is to add to our understanding 
of the coviewing process by investigating how a clearly defined, edu- 
cationally-relevant form of coviewing — one that reflects processes that 
are commonly seen in parent-child interactions — impacts low-income 
preschoolers' attention to and comprehension of educational media. We
chose to focus on a sample of preschoolers from lower-income house- 
holds in order to better understand a potentially supportive context for 
the children who may be in greater need of additional scaffolds. Prior 
research has shown consistent differences in language processing, 
production, and comprehension based on socioeconomic differences 
from an early age (Ginsborg, 2006; Pace, Luo, Hirsh-Pasek, & Golinkoff, 
2017). At the same time, children from lower socioeconomic status 
households tend to consume more media than their peers (Rideout, 
2017). Considering the human resources needed for coviewing and the 
inconsistent findings of prior research on the benefits of coviewing, 
understanding how best to invest resources to support the compre- 
hension of children from lower-income households is particularly im- 
portant. Within our sample, we therefore also investigate the role of 
children's baseline vocabulary in predicting attention and comprehen- 
sion. Prior research has found that extant vocabulary plays an im- 
portant role in predicting learning from media (e.g. Blewitt, Rump, 
Shealy, & Cook, 2009). As such, in a sample at risk for weaker language 
skills, we recognized the need to understand if the added language 
input of a coviewer might differentially support children based on the 
language competence they initially bring to the experience. In other 
words, we investigated whether coviewing might be a stronger support 
for children with high enough language skills to process two sources of 
language input. We therefore investigated the potential interactions 
between coviewing, attention, and baseline vocabulary that could help 
illuminate the circumstances under which coviewing relates to chil- 
dren's comprehension of educational media. 


Coviewing educational media 


There are multiple pathways through which coviewing might ben- 
efit attention and comprehension of media. Salomon (1977), for ex- 
ample, discussed how parent-child coviewing promoted enjoyment of 
viewing for both parties, which in turn could support children's atten- 
tion to the program. Coviewers could additionally provide discussion 
and help as needed, thereby supporting comprehension of media con- 
tent. For example, providing children with repetition and elaboration of 
important plot information could enhance the amount of input they 
receive related to the plot content, supporting their level of rehearsal of 
the content and comprehension of it (Watkins, Huston-Stein, & Wright, 
1981). 

In spite of the comprehension support coviewing might provide, 
research on the impact of coviewing on preschoolers' comprehension 
has produced fairly inconsistent results. Reiser et al. (1984) found that 
three- and four-year-old children performed better on letter and
number naming when adult coviewers asked the child to name the
letters and numbers and gave contingent feedback during the educational
program compared to when viewing with a silent adult researcher.
This study employed an intensive questioning approach to
coviewing, however, that is unlikely to be found under more naturalistic
circumstances. Similarly, in a study of three-year-olds’ video-storybook
comprehension, Strouse et al. (2013) investigated two forms of coviewing
and found that one form, an intensive coviewing intervention that trained
parents to pause the video and engage in dialogic questioning with their
child, resulted in greater comprehension than a control. However, Strouse
et al. (2013) noted that parents rarely
employed these questioning techniques spontaneously. As such, in- 
tensive coviewing interventions that are characterized by relatively 
non-naturalistic methods of child questioning have often demonstrated 
benefits of coviewing for comprehension. 

In contrast, Strouse et al. (2013) found that the other coviewing
enactment they studied, one in which parents did not ask questions and
instead directed their child's attention and discussed the program with
them, did not promote comprehension. Similarly, other studies employing
more naturally occurring coviewing enactments with three-
and four-year-old children have yielded no overall benefits to
comprehension or learning for experimentally manipulated coviewing
over viewing alone (Rasmussen et al., 2016; Skouteris & Kelly, 2006). 
Unfortunately, these naturalistic studies did not describe how parents 
enacted coviewing, and instructions given to parents were quite general 
(e.g. “talk to your child as much as possible about the show”). As such, 
even though these studies had the strength of employing naturalistic 
parent-child coviewing, the lack of information on enactment or pro- 
cess variables makes it difficult to understand the characteristics of 
coviewing that failed to produce comprehension gains for children or 
why that may have been. 

This presents a potential problem for policies such as that by the 
American Academy of Pediatrics that recommend coviewing. 
Policymakers rarely suggest specific strategies to use while coviewing, 
and the literature suggests that some of the strategies that parents use 
spontaneously may not be effective in supporting children's learning 
and comprehension. In fact, in a correlational study, Rice, Conti- 
Ramsden, and Snow (1990) found that viewing “Sesame Street” alone 
over two years was related to improved vocabulary gains, whereas 
viewing the show with an adult was unassociated with vocabulary 
improvements. This highlights a critical need to better understand how 
coviewing enactments that employ techniques that are typical of 
parent-child interactions impact not only comprehension, but also 
process variables (e.g. attention). 


Coviewing enactment of present study 


In the present study, we systematically investigate how a form of 
adult-child coviewing that incorporates elements commonly used in 
parent-child interactions influences both attention to and comprehen- 
sion of educational media. In order to develop our coviewing enact- 
ment, we therefore drew on research documenting naturally occurring, 
positive parent-child interaction elements during shared book reading. 
These elements included pointing, discussing important word mean- 
ings, making comments or connections to help children understand past 
story elements, and making comments related to opinions or reactions 
related to story content (Evans, Reynolds, Shaw, & Pursoo, 2011; Fisch, 
Shulman, Akerman, & Levin, 2002; Ninio & Bruner, 1978; Roser & 
Martinez, 1985). We incorporated the aforementioned elements into 
our enactment of coviewing. 

Another practice that often occurred during shared book reading 
was parents asking their child questions (e.g. Fisch et al.,
2002). Questioning has been a central feature of prior coviewing re- 
search that has demonstrated learning benefits to coviewing (e.g. Reiser 
et al., 1984; Strouse et al., 2013). However, we did not utilize ques- 
tioning in our enactment due to the contextual differences between an 
educational media environment such as video and the traditional 
storybook environment. The pacing of traditional storybooks is self- 
determined, and pauses for discussion or questioning are easy to 
spontaneously embed within the interaction. Extensive questioning and 
discussion is not as well suited to a video viewing environment because 
the discussion tends to result in the child missing the content of the 
video that follows. It is therefore not a common practice to naturally 
pause a video to discuss its content. Indeed, the dialogic questioning
coviewing intervention studied by Strouse et al. (2013) involved
training parents to pause the video in order to engage in extensive 
questioning. 

In order to maintain a more natural media consumption experience, 
we only utilized the elements of shared book reading that were best 
suited to the contextual constraints of the media environment:
pointing, important vocabulary discussions, providing brief plot recaps 
and comments, reacting to the program content, and briefly elaborating 
on content. We provided a clearly defined, scripted, and focused cov- 
iewing procedure that represented a strong, educationally-relevant 
enactment of frequent parent-child interaction elements in a coviewing 
context. We therefore studied whether a rich use of these strategies 
would support the attention and comprehension of preschoolers who 
were coviewing educational media with an adult.

Within the context of our coviewing enactment, we additionally 
sought to investigate potential reasons for why educationally-relevant 
coviewing may not necessarily support children's comprehension of 
educational media. There are multiple reasons why coviewing might 
not benefit comprehension. In the present study, we focus on two
possible interpretations. (i) The visual interpretation: children
might become visually distracted by the coviewer when the coviewer speaks
to them. The detriments of lessened visual attention and
the benefits of increased audio input from a coviewer might ultimately
offset each other, resulting in few comprehension gains from coviewing. (ii) The
auditory interpretation: the additional audio input provided by
the coviewer may become overwhelming for children with weaker 
language skills, and therefore only potentially benefit comprehension 
for children with stronger baseline vocabularies. If child baseline lan- 
guage, such as vocabulary, plays a role in how effectively children can 
take advantage of the coviewing experience, failure to take this into 
consideration might limit the ability to detect potential comprehension 
benefits to coviewing for certain children. As such, both the visual and 
auditory interpretations suggest different potential pathways through 
which the link between coviewing and comprehension might be in- 
tegrally tied to attention and child baseline vocabulary. 


The visual interpretation — child attention to educational media 


Though some work has investigated the arousal processes associated 
with parent-child coviewing such as heart rate (e.g. Keene et al., 2019; 
Rasmussen, Keene, Berke, Densley, & Loof, 2017), little research has 
investigated how coviewing impacts child visual attention to an edu- 
cational media program. Two possible hypotheses emerge on how 
coviewing educational media might interact with attention. On the one 
hand, as Salomon (1977) and Strouse et al. (2013) suggested, a coviewer
might enhance child attention by providing a model of attention, in- 
creasing interest in the program, and/or directing child attention de- 
liberately. Studies by Keene et al. (2019) and Rasmussen et al. (2017) 
similarly suggest arousal patterns that reflect stronger engagement 
when viewing media with an adult compared to alone. 

Alternatively, children may be inclined to look away from the 
screen and towards the coviewer while the coviewer is talking. In a 
study of peer coviewing, Anderson et al. (1981) found that children
viewing educational media in groups of three showed weaker visual 
attention to television than children viewing without peers. Looking 
away from the screen would inherently reduce visual attention at those 
points, and may potentially disrupt visual engagement with the content 
of the program more generally. Richards and Anderson (2004) discuss 
how attentional inertia, or sustained looking at the screen without
looking away, consistently predicts learning from television. If a coviewer
visually distracts the child from the screen by talking to the child,
this might interrupt the flow of attentional inertia, which may in turn 
be detrimental to visual engagement with the program. 

In the present study, we therefore use eye-tracking to investigate 
how coviewing impacts preschoolers' attention to the screen. If our 
enactment of coviewing reduces child attention to the screen, this might 
be a potential explanation as to why such forms of coviewing rarely 
show benefits to comprehension. Nonetheless, prior research has rarely 
found coviewing to be detrimental to comprehension, so it is also 
possible that coviewing might have little effect, or even a positive effect,
on attention. If coviewing facilitates attention, the next step would be to
ascertain if attention in turn predicts comprehension. 

Unfortunately, the connection between looking time and learning 
has not been reliably established (Kirkorian, Pempek, & Choi, 2017). In 
some studies, the total time spent looking at learning-related stimuli
was associated with learning (e.g., Roseberry, Hirsh-Pasek, Parish-Morris,
& Golinkoff, 2009), while in other cases, looking times did not
predict learning (Schmitt & Anderson, 2002). Visual attention is only 
one of many forms of attention that can predict learning, and, as such, 
enhanced visual attention alone might not directly translate to enhanced
comprehension. We therefore turn to the other sensory source
of input — auditory input — and investigate the role of the child char- 
acteristic of baseline vocabulary in predicting the conditions under 
which coviewing might benefit comprehension. Since a coviewer is 
primarily a source of auditory input, children's language proficiency 
may be integrally tied to how effectively they can process and use the 
added input to support their comprehension. 


The auditory interpretation — child baseline vocabulary 


Our enactment of adult-child coviewing incorporated many audi- 
tory elements that are common in parent-child interactions, such as 
comments on past story events (Evans et al., 2011; Fisch et al., 2002; 
Ninio & Bruner, 1978). It is therefore possible that children's baseline 
language skills, such as vocabulary, play an important role in whether 
or not they are able to take advantage of the added auditory input. An 
influential theory that delineates this possibility is dual coding theory 
(Clark & Paivio, 1991; Paivio, 1986, 1990). 

Dual coding theory proposes that two different sensory modes of 
presentation of information (e.g., visual, auditory) promote learning of 
that information better than just one mode of presentation. This is 
because the two modalities are theorized to tap into different cognitive 
resources, and therefore not compete for the same limited processing 
resources. Thus, combining multiple modalities to teach the same 
content is beneficial to learning and comprehension. Educational screen 
media taps into both the visual and auditory channels, providing a more 
complete representation of the story content than one channel alone. 

Central to this theory is the notion that we have limited cognitive 
processing resources within a single modality. Once our processing 
resources are being fully utilized, additional input in the same modality 
would no longer be beneficial. In the context of coviewing educational 
media, the media itself provides both visual and auditory input, and the 
coviewer provides an additional source of auditory input. It is therefore 
possible that when children have weaker initial vocabularies, they may 
not be able to take advantage of the additional auditory input provided 
by the coviewer as their cognitive resources are being fully utilized to 
process the audio content of the media itself. 

Children with stronger vocabularies, however, may require fewer 
resources to process the audio from the media, and may be better able 
to take advantage of the additional auditory input provided by the 
coviewer. If this is the case, children might need sufficiently strong 
language skills to process coviewer sources of auditory input in order 
for coviewing to support comprehension over viewing alone. In the 
present study, we therefore investigate whether coviewing might in- 
teract with children's baseline vocabulary to predict comprehension of 
educational media. 

We additionally examine whether coviewing interacts with both 
vocabulary and attention to predict comprehension. Prior research 
shows that attention might not only predict comprehension, but that the
reverse might also hold. Prior vocabulary knowledge and background
knowledge have been shown to support the visual attentional
processes of children (Anderson, Lorch, Field, & Sanders, 1981; Kaefer, 
2018; Kaefer, Pinkham, & Neuman, 2017; Kaefer, Neuman, & Pinkham, 
2015). This suggests that baseline factors such as vocabulary may ul- 
timately work alongside visual attention to predict comprehension. 
Additionally, aligned with dual coding theory that emphasizes the ad- 
ditive nature of visual and auditory input, it may be that children need 
a combination of a stronger baseline vocabulary to successfully process 
audio input, and sufficient visual attention for visual processing in 
order for comprehension to be supported. We investigate these possi- 
bilities in the present study. 


The present study 


The present study focuses on a sample of low-income preschoolers 
to extend the literature on coviewing, attention, baseline vocabulary,
and comprehension. We investigate how a coviewing enactment in- 
corporating educational elements of parent-child interaction impacts 
visual attention and story comprehension. Children viewed one educational
media episode with an adult coviewer and another without one
while being eye-tracked. Children's baseline vocabulary was assessed 
prior to viewing the videos, and children completed a comprehension 
assessment after viewing each video. Our study focused on the fol- 
lowing questions: 


i) Does coviewing impact visual attention to educational media? 
ii) Does attention predict comprehension of educational media? 
iii) Does coviewing benefit comprehension of educational media? 
iv) Does coviewing interact with attention and/or child baseline vo- 
cabulary to predict stronger comprehension? 


Overall, the present study aims to move beyond exclusively in- 
vestigating the direct influence of coviewing on comprehension to de- 
veloping a more nuanced understanding of the conditions under which 
coviewing might be more likely to support comprehension in pre- 
schoolers. 


Method


Participants


Participants were 83 three- and four-year-old children
(Mage = 4.3 years, SDage = 0.37 years; range = 42-59 months); 64% 
were female. Sample size determinations were made based on re- 
commendations by Morgan and Case (2013) who suggest that a con- 
servative sample size estimate for a repeated measures analysis of 
covariance can be approximated as a 44% reduction of the power es- 
timates for a two-sample t-test. Power analyses in G*Power (Faul, 
Erdfelder, Lang, & Buchner, 2007) for a 2-tailed, two-sample t-test with 
an estimated power of 0.8 suggested a sample size of 128. The 44% 
reduction resulted in a desired sample of 72 children in our study. We 
also used linear mixed modeling in our sample, and therefore used 
G*Power to determine the number of data points needed when using a 
model fully controlling for subject (the number of predictors in the 
model plus the number of participants, each representing a dummy-coded
variable). A 2-tailed linear multiple regression model with 90
predictors (83 subjects plus 7 predictors) at a power of 0.8 required a
minimum of 94 data points. Our sample comprised 83 participants, 
each with two data points, thereby meeting the power requirements of 
this analysis. 
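
The sample size arithmetic described above can be re-traced in a few lines. This is only a sketch of the reported numbers, not the G*Power computation itself: the constants are taken from the text, while the variable names and the choice to round the reduced estimate up to a whole child are our assumptions.

```python
import math

# Values reported in the text.
T_TEST_TOTAL_N = 128       # G*Power estimate, two-tailed two-sample t-test, power = 0.8
REDUCTION = 0.44           # reduction attributed to Morgan and Case (2013)
N_PARTICIPANTS = 83
N_MODEL_PREDICTORS = 7
MIN_DATA_POINTS = 94       # G*Power estimate for the regression with 90 predictors
POINTS_PER_CHILD = 2       # one observation per viewing condition

# Assumption: the reduced estimate is rounded up to the next whole participant.
target_n = math.ceil(T_TEST_TOTAL_N * (1 - REDUCTION))

# Predictors when subject is fully controlled: one dummy code per child plus model terms.
total_predictors = N_PARTICIPANTS + N_MODEL_PREDICTORS

# Data points available for the linear mixed model.
available_points = N_PARTICIPANTS * POINTS_PER_CHILD
meets_requirement = available_points >= MIN_DATA_POINTS

print(target_n, total_predictors, available_points, meets_requirement)
```

Under these assumptions, the reduced t-test estimate matches the desired sample reported above, and the number of available data points exceeds the stated minimum.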

Participating children were enrolled in two Head Start centers lo- 
cated in high poverty areas in a large urban city. The sample was di- 
verse: 29% were African American, 49% were Hispanic, 18% were West 
Indian, and 4% were Asian or biracial. Educational directors, teachers, 
and parents provided consent for participation. Children provided 
verbal assent. IRB approval was obtained from New York University
(IRB-FY2016-1251, Title: “Educational Media Support for Low-income 
Preschoolers' Vocabulary Development”). All children qualified for free 
and reduced lunch. Standardized receptive language scores, as mea- 
sured by the Peabody Picture Vocabulary Test-IV (PPVT-IV; Dunn & 
Dunn, 2007), averaged 79.64 (SD = 15.76), which is more than one 
standard deviation below the population mean. 


Research design 


We employed a within-subjects design in which participating chil- 
dren each viewed two videos — one in the coviewing condition, and the 
other in the non-interactive condition. In both conditions, children 
participated in the study individually in a one-on-one session with a 
researcher. The two conditions were run on the same day spaced ap- 
proximately one hour apart. Coviewing condition order and the video 
used in each condition were counterbalanced between participants.
Children were eye-tracked while watching both videos in order to de- 
termine the influence of coviewing on attention to the screen. Children 
were administered the PPVT-IV prior to viewing the educational videos 
as a baseline language indicator. 
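
One way to picture the counterbalancing is as a 2 x 2 crossing of condition order and video assignment. The sketch below is illustrative only: the function name, the video labels, the seed, and the shuffle-then-rotate allocation are assumptions of ours, since the text does not specify how children were allocated to counterbalancing cells.

```python
import itertools
import random
from collections import Counter

# Hypothetical labels for the two within-subject factors being counterbalanced.
CONDITION_ORDERS = [("coviewing", "non-interactive"), ("non-interactive", "coviewing")]
VIDEO_ORDERS = [("seed", "shapes"), ("shapes", "seed")]


def counterbalance(participant_ids, seed=0):
    """Assign each child to one of the four order-by-video cells.

    Assumption: children are shuffled and then rotated through the cells so
    that cell sizes differ by at most one.
    """
    cells = list(itertools.product(CONDITION_ORDERS, VIDEO_ORDERS))
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    assignment = {}
    for index, pid in enumerate(shuffled):
        condition_order, video_order = cells[index % len(cells)]
        # Session k pairs the k-th condition with the k-th video.
        assignment[pid] = list(zip(condition_order, video_order))
    return assignment


assignment = counterbalance(range(1, 84))
cell_sizes = Counter(tuple(sessions) for sessions in assignment.values())
print(cell_sizes)
```

Each child in this sketch sees both conditions and both videos exactly once, which is the property a within-subjects design of this kind relies on.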


Educational media episodes 


Since we were interested in studying how coviewing impacts com- 
prehension, we used two narrative videos that each focused on a dif- 
ferent science-related plot. Our educational media stimuli were devel- 
oped from two 9.5-min narrative videos from the television show Peep 
and the Big Wide World, a program teaching science concepts to 
3- to 5-year-old children. In one episode, the main characters found a
beautiful flower far from home, and decided to grow their own flower 
closer to home. The bulk of the episode focused on the steps it took to 
grow the flower from the seed to a fully-grown plant. In the second 
episode, the characters were looking for buried treasure when they 
spotted a square in the sand. Upon pulling it out of the sand, they 
discovered it was actually a three-dimensional block (cube). They found 
more shapes in the sand, leading one character to think he had a unique 
treasure-finding ability. The characters learned more about the re- 
presentation of three-dimensional shapes throughout the episode. The 
episodes incorporated a strong narrative with clear visual representa- 
tions of main plot points, which aided in the assessment of compre- 
hension. 


Eye-tracking apparatus and data processing 


Researchers utilized the child-friendly Tobii T120 system, a remote 
eye-tracking system that has an infrared-based eye-tracker integrated 
into a computer with an LCD screen. The Tobii T120 samples at 120 Hz 
and has an accuracy of 0.5 visual degrees. Children's eye movements 
while viewing videos in the coviewing and non-interactive conditions 
were recorded on this eye-tracker. Children were calibrated prior to 
viewing each video using a 5-point manual calibration on screen. Data 
were processed and exported using Tobii Studio 3.0. 


Measures 


Peabody Picture Vocabulary Test, Fourth Edition (PPVT-IV; Dunn & Dunn, 2007)

The PPVT-IV is a validated, norm-referenced
instrument that was used as a baseline assessment of receptive voca- 
bulary. In this assessment, children are asked to point to one of four 
image options that depicts a named word. The assessment provides both 
raw and age-standardized scores as an indicator of baseline vocabulary. 
Both scores are reported, though only the raw scores are used in the 
linear mixed model since age is added as a separate predictor.


Narrative story comprehension 

In order to assess children's narrative story comprehension, asses- 
sors showed children six screenshots from each video. Screenshots each 
depicted an important plot point in the video narrative, and were used 
to cue children's story recall. For each picture, children were asked, 
“What happened during this part?” Children provided their responses, 
and were given an additional prompt, “Anything else?” after their in- 
itial response was complete. Assessors wrote down children's responses 
verbatim for later coding. 

All child responses were transcribed, and a trained primary coder 
coded all comprehension responses by noting the number of accurate 
statements children made about the story (see Table 1 for examples of 
child responses and assigned codes). The primary coder was blind to the 
condition within which each video was viewed. Coded scores were 
summed across all six pictures to provide a measure of overall narrative 
story comprehension. A second trained coder independently coded 10% 
of responses, and inter-rater reliability was established at
Kappa = 0.96.


Table 1
Comprehension coding examples.

Video and image                                Child comprehension response                              Score

Shapes video
Image: Characters find a shape in the sand.    “They saw a shape”                                        1
                                               “They saw something./ They take it out.”                  2
                                               “The purple one is sad.”                                  0 (inaccurate)

Seed video
Image: Character planting the seed.            “They are just putting the seed.”                         1
                                               “He put the seed in./ He put water/ and it growed up.”    3
                                               “He fell down.”                                           0 (inaccurate)
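
For readers unfamiliar with the agreement statistic, the following sketch shows how an unweighted Cohen's kappa could be computed from two coders' scores. It assumes that the reported kappa is the unweighted Cohen's kappa, which the text does not state explicitly, and the scores below are invented purely for illustration; they are not study data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Unweighted Cohen's kappa for two coders rating the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    n = len(coder_a)
    # Proportion of items on which the two coders gave the same score.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal score frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented example scores (number of accurate statements per picture).
primary_coder = [1, 2, 0, 1, 3, 0, 2, 1]
second_coder = [1, 2, 0, 1, 3, 0, 1, 1]
kappa = cohens_kappa(primary_coder, second_coder)
print(round(kappa, 2))
```

Because the comprehension scores are counts, a weighted kappa would also be a defensible choice; the unweighted form is used here only because it is the simplest to show.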


Eye-tracking fixation duration 

We were interested in seeing how coviewing impacted visual at- 
tention to the screen while viewing the video. We extracted the total 
fixation duration children spent looking at the screen during the full 
video. Fixation durations were extracted using Tobii Studio 3.0 soft- 
ware. Fixations were defined as coordinates lasting 60 milliseconds or 
more, which were also identified by the fixation filter in the software 
program. Fixation durations, or the amount of time spent looking at a 
specific location, have frequently been used as an index of attention and 
processing of visual information (Just & Carpenter, 1980; Tsai, Hou, 
Lai, Liu, & Yang, 2012). 
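
The attention measure can be pictured as a filtered sum over exported fixations. The sketch below is a hypothetical re-implementation for illustration, not the Tobii Studio procedure: the `Fixation` record, the function name, and the screen dimensions are assumptions, and only the 60 ms threshold comes from the text.

```python
from dataclasses import dataclass

MIN_FIXATION_MS = 60  # threshold stated in the text


@dataclass
class Fixation:
    """Hypothetical exported fixation: screen coordinates and duration."""
    x: float
    y: float
    duration_ms: float


def total_fixation_duration_ms(fixations, screen_width=1280, screen_height=1024):
    """Sum the durations of on-screen fixations lasting at least 60 ms.

    Assumption: the screen bounds are given in pixels; the default values are
    placeholders rather than a documented property of the study's setup.
    """
    total = 0.0
    for fixation in fixations:
        on_screen = 0 <= fixation.x < screen_width and 0 <= fixation.y < screen_height
        if on_screen and fixation.duration_ms >= MIN_FIXATION_MS:
            total += fixation.duration_ms
    return total


sample = [
    Fixation(100, 100, 250),   # counted
    Fixation(2000, 100, 300),  # off-screen, ignored
    Fixation(640, 512, 40),    # shorter than 60 ms, ignored
    Fixation(10, 10, 60),      # exactly at the threshold, counted
]
total_ms = total_fixation_duration_ms(sample)
print(total_ms)
```

Dividing such a total by the video length would give a proportion of looking time, although the measure used in the present study is the total duration itself.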


Procedure 


Trained graduate student assessors administered all assessments 
individually to children in a quiet location at their preschool. Children 
first completed the PPVT-IV. They were then randomly assigned to a 
counterbalancing condition (condition order; video in each condition). 
On a later day, children completed both the coviewing and non-inter- 
active conditions spaced approximately one hour apart. In each con- 
dition, children were calibrated on an eye-tracker and watched one 9.5- 
min video while their eye movements were recorded. The compre- 
hension assessment for the relevant video was administered im- 
mediately following the video viewing in both conditions. The cov- 
iewing and non-interactive sessions each lasted 20 min. The coviewing 
and non-interactive conditions are described in detail below. 


Coviewing condition 

In the coviewing condition, children viewed the video clips with a 
trained graduate student on an eye-tracker monitor computer. In order 
to ensure the coviewing enactments were consistent, graduate student 
assessors were trained to follow a specific coviewing script for each 
video. The script was designed to engage the child in an educational 
manner while not being too disruptive. 


Coviewing elements 

The coviewing script incorporated interaction elements that re- 
quired the coviewer to provide additional information about important 
concepts or words in the story, make real-life connections, reiterate 
certain plot points, and display engagement with the program by re- 
acting to the program in a way that aligned with the content of the 
program (e.g., by laughing when something funny happens). An excerpt 
of the video and coviewing script can be found in Table 2. All of the 
aforementioned interaction elements were fully scripted to ensure 
consistency of implementation. Comments or questions from children in 
this study were extremely rare. Nonetheless, in the few cases where the 
child asked a question or made a comment, coviewers provided a short 
contingent response to acknowledge that they heard the comment, such 
as saying “yeah!”, “uh-huh” or “mmmm?” in an interested tone. 


Non-interactive condition 

In the non-interactive condition, children viewed the video on the 
eye-tracker monitor without any adult interaction. Graduate student 
assessors told participating children that they would be watching a 
video and answering some questions afterwards. Assessors remained in 
the room to supervise the child, but made their presence less salient by 
sitting 10 ft away from the child and pretending to read a book. They 
did not make eye contact or interact with the child while the video was 
playing. 


Analysis 


The present study investigated how coviewing educational media 
with an adult impacted preschoolers' visual attention and story comprehension. 
In order to analyze how coviewing impacted visual attention, 
we conducted a repeated measures analysis of covariance with the 
dependent variable of visual attention, the within-subjects factor of 
coviewing condition (2: coviewing, noninteractive) and the (mean- 
centered) covariates of PPVT raw scores and child age. To answer our 
remaining research questions investigating whether attention, cov- 
iewing, or an interaction between attention, coviewing and/or PPVT 
scores predict comprehension, we conducted a two-level HLM with 
participant as the level-2 factor and coviewing condition as the re- 
peated measures factor. The model contained a random L2 intercept as 
well as the fixed predictors of child age, coviewing condition, PPVT raw 
score, fixation duration, and all two- and three-way interactions be- 
tween coviewing condition, PPVT scores, and fixation duration. There 
were no significant correlations between attention, age, and PPVT 
scores, verifying low multicollinearity among predictors. All covariates 
and predictors used in these analyses were mean-centered. Data were 
analyzed using IBM SPSS Statistics version 25. Post-hoc simple slopes 
analyses were conducted to interpret interactions using Stata version 
15. 
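
As a rough illustration of the predictor construction described above, the following Python sketch mean-centers the continuous predictors and forms the two- and three-way product terms. The function names, the 0/1 condition coding, and the sample values are hypothetical; they are not taken from the study's data or from the SPSS syntax the authors used.

```python
from statistics import mean

def mean_center(values):
    """Subtract the sample mean so that 0 represents the average case."""
    m = mean(values)
    return [v - m for v in values]

def interaction_terms(condition, ppvt_c, fixation_c):
    """Build the two- and three-way product terms for one observation."""
    return {
        "cond_x_ppvt": condition * ppvt_c,
        "cond_x_fix": condition * fixation_c,
        "ppvt_x_fix": ppvt_c * fixation_c,
        "cond_x_ppvt_x_fix": condition * ppvt_c * fixation_c,
    }

# Hypothetical values for illustration only (not study data).
ppvt_raw = [60, 72, 85, 91]
fixation_pct = [55.0, 70.0, 80.0, 75.0]
condition = [1, 0, 1, 0]  # assumed coding: 1 = coviewing, 0 = noninteractive

ppvt_c = mean_center(ppvt_raw)
fix_c = mean_center(fixation_pct)
rows = [interaction_terms(c, p, f) for c, p, f in zip(condition, ppvt_c, fix_c)]
```

Centering before forming the products keeps each lower-order coefficient interpretable as the effect at the sample mean of the other predictors.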


Results 


The present study investigated the connections between adult-child 
coviewing, visual attention to screen, baseline language proficiency, 
and comprehension. We specifically looked at i) whether adult-child 
coviewing impacted child attention, and ii) the potential predictors of 
comprehension including coviewing, attention, PPVT scores, and in- 
teractions. 


Preliminary analyses 


To determine whether condition order or the video used in each 
condition needed to be included in further analyses, we first 
tested whether these counterbalanced variables affected children's 
comprehension or attention. There were no statistically significant 
differences in comprehension based on video, F(1, 81) = 3.72, 
p = .057 or condition order, F(1, 81) = 1.04, p = .312. Similarly, there 
were no significant differences in attention between the two condition 
orders, F(1, 67) = 2.88, p = .094 or videos, F(1, 67) = 0.37, p = .544. 
As such, video and condition order were excluded from all further 
analyses. 


Coviewing and visual attention 


We sought to understand how coviewing affects child attention to 
educational screen media. We aimed to ascertain which of two com- 
peting hypotheses — one predicting that children would be distracted 
from the video by the coviewer, and the other suggesting that children's 
interest and therefore attention might be enhanced by a coviewer — 
would be supported by the data. We conducted a repeated measures 
ANCOVA with the within-subjects factor of coviewing condition (2: 
coviewing, noninteractive) and the covariates of PPVT raw score and 
age on children's attention to the educational videos. 

Table 2 
Excerpt from video and coviewing script. 

Seeds video script 

Narrator: Peep did not give up. That's when he learned that waiting is hard to do. 
Narrator: Peep got water from the stream every day. He watered and waited... 
Narrator: for days... and days... and days... and days... and days... and days. Quack and Chirp began to worry. 
Chirp: Nothing is ever going to grow. How can we help Peep? 
Quack: We have to dig up the seed and eat it, so he will give up. 
Quack: I can't find it! 
Chirp: Did he bury it by this green thing? 
Peep: That's it! It grew into a baby plant! 

Coviewer script 

It is hard, huh? 
Wow, he's working hard to get water from the stream! But that's what helps flowers grow! 
[laugh] 
Oh no! 
Cool! It started growing! They can see it now! 

Children visually attended to a significantly greater percentage of 
the program when viewing with a coviewer (M = 73.35, SD = 18.97) 
than when viewing alone (M = 66.87, SD = 21.60), F(1, 68) = 9.29, 
p = .003. There were no significant main effects or interactions with 
child age or PPVT score, suggesting that attentional processes were 
similar regardless of baseline language proficiency and age in the 18- 
month range investigated in this study. See Table 3 for inferential sta- 
tistics. Overall, results suggest that an interactive adult coviewer did 
not visually distract children from the screen, but rather strengthened 
children's visual engagement with the educational media program. 
Additionally, this stronger visual attention was not related to the child 
characteristics of age or baseline vocabulary size. We next turned to 
whether attention, PPVT scores, and/or coviewing might predict com- 
prehension. 
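
The F statistic for the condition effect can be checked against the mean squares and sums of squares reported in Table 3: F is the effect mean square divided by the error mean square, and the error mean square is the error sum of squares divided by its degrees of freedom. The short Python sketch below reproduces that arithmetic. The function and variable names are illustrative, and the inputs are the rounded values from Table 3.

```python
def f_ratio(ms_effect, ss_error, df_error):
    """Return (F, MS_error) for an effect tested against the given error term."""
    ms_error = ss_error / df_error
    return ms_effect / ms_error, ms_error

# Within-subjects condition effect, values as reported in Table 3.
f_condition, ms_error = f_ratio(ms_effect=1537.42, ss_error=11252.30, df_error=68)

# F comes out at approximately 9.29, in line with the reported F(1, 68) = 9.29.
print(f"MS_error = {ms_error:.3f}, F = {f_condition:.2f}")
```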


Predictors of comprehension 


We next analyzed whether coviewing, visual attention, or an in- 
teraction between coviewing, attention and/or PPVT raw scores might 
impact comprehension. We conducted a two-level linear mixed model 
on children's comprehension, with an L2 factor of participant and a 
repeated factor of coviewing condition. Fixed predictors in the model 
were child age, coviewing condition, PPVT raw score, fixation duration, 
and two- and three-way interactions between coviewing condition, 
PPVT scores, and fixation duration. This analysis revealed three sig- 
nificant predictors of children's comprehension of educational media: 
child age, t(73.26) = 2.07, p = .042, PPVT raw score, t(74.35) = 4.60, 
p < .001, and the three-way interaction between coviewing condition, 
fixation duration, and PPVT score, t(75.63) = 2.38, p = .020. 
Children received higher comprehension scores with increasing age, 
r(166) = 0.30, p < .001, and PPVT scores, r(166) = 0.40, p < .001. 
There were no significant differences in comprehension between the 
coviewing (M = 9.01, SD = 5.34) and noninteractive (M = 8.83, 
SD = 4.83) conditions, t(71.31) = 0.003, p = .998. Statistics related to 
predictors in this model are in Table 4, and correlations between pre- 
dictors can be found in Table 5. 

In order to interpret the 3-way interaction, data were graphed by 
plotting the comprehension scores of children based on attention to 
screen and PPVT scores. Fig. 1 shows comprehension scores for each 
coviewing condition based on a median split of coviewing visual 
attention and PPVT scores. When comparing the coviewing to the 
noninteractive condition, comprehension in the coviewing condition began 
to surpass the noninteractive condition only when children had both 
stronger vocabularies and higher attention to media, t(20) = 2.01, 
p = .059. Little difference was observed by coviewing condition when 
both vocabulary and attention were low, t(19) = 0.41, p = .689. When 
children had only one of the two characteristics (attention or 
vocabulary), their comprehension scores were actually stronger in the 
noninteractive condition. 


Table 3 
Main effects and interactions for attention by coviewing condition. 

Coview vs. noninteractive condition main effects and interactions 

Dependent variable: Percent fixation duration on overall video 

Contrast            F      df     Sig.   MS Effect   SS Error    MS Error 
Coview Condition    9.29   1/68   .003   1537.42     11252.30    165.48 
Coview by Age       .26    1/68   .614   42.37 
Coview by PPVT      1.06   1/68   .306   175.69 
Age                 .85    1/68   .359   571.90      45522.27    669.45 
PPVT                .17    1/68   .680   114.95 

*p < .05. 


Table 4 
Estimates and significance of HLM predictors. 

Predictors                                       Estimate   Standard error   df       t      Sig. 
Coview condition                                 < .001     .21              71.43    .003   .998 
PPVT*                                            .12        .03              74.35    4.60   < .001* 
Age                                              2.75       1.33             73.26    2.07   .042 
Fixation duration                                .03        .02              131.28   1.75   .083 
Coview condition by fixation duration            .01        .01              72.16    .95    .348 
Coview condition by PPVT                         .01        .01              69.83    .76    .448 
PPVT by fixation duration                        .001       .001             115.42   1.17   .243 
Coview condition by PPVT by fixation duration    .002       .001             75.63    2.38   .020 

*p < .05. 


Table 5 
Correlations between comprehension and model predictors. 

Variables               1      2      3      4      5 
1. Comprehension        - 
2. Age                  .30    - 
3. PPVT raw score       .50    .16    - 
4. Fixation duration    .20    .13    .08    - 
5. Coview condition     .02    -      -      .19    - 

*p < 0.05. 
**p < 0.01. 


Fig. 1. Children's comprehension with lower and higher attention to coviewing video (median split) by coview condition in each of two PPVT groups (standard score below vs. within 1 SD of 100). Groups shown: Lower PPVT (<85) & Low Attention; Lower PPVT (<85) & High Attention; Higher PPVT (85+) & Low Attention; Higher PPVT (85+) & High Attention. 

We conducted an analysis of whether the simple slopes of the as- 
sociation between comprehension and PPVT scores varied in different 
combinations of attention and coviewing condition. Of the six pairwise 
comparisons between slopes, two revealed significant 
differences. Specifically, the children in the coviewing 
condition with stronger attention to video had a steeper slope relating 
comprehension and PPVT compared to children with i) high attention 
in the noninteractive condition (t = 3.04, p = .003), or ii) low attention 
in the coviewing condition (t = 2.02, p = .048). The final group, 
children with low attention in the noninteractive condition, did not 
significantly differ in slope from any other group. As such, we found 
that the combination of coviewing, high attention, and high vocabulary 
showed the clearest connection to comprehension. Overall, these results 
suggest that in order for coviewing to positively predict comprehension, 
children needed to have both a high enough baseline vocabulary and 
strong enough attention to video to take advantage of it. Having low 
attention to the video and/or low language proficiency corresponded 
with coviewing no longer benefiting comprehension. 
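
To make the simple slopes logic concrete: in a model with a three-way interaction, the slope of comprehension on PPVT at a given condition and attention level is the PPVT coefficient plus the relevant interaction coefficients weighted by those values. The Python sketch below illustrates the calculation. The coefficient values, the 0/1 condition coding, and the centered attention values of plus or minus 20 are placeholders chosen for illustration; they are not the study's estimates, and the authors' Stata procedure may differ.

```python
def ppvt_simple_slope(b, condition, fixation_c):
    """Slope of comprehension on centered PPVT at a fixed condition and attention.

    b holds hypothetical fixed-effect coefficients from a model containing
    PPVT, condition x PPVT, PPVT x fixation, and the three-way product.
    """
    return (
        b["ppvt"]
        + b["cond_x_ppvt"] * condition
        + b["ppvt_x_fix"] * fixation_c
        + b["cond_x_ppvt_x_fix"] * condition * fixation_c
    )

# Placeholder coefficients for illustration only (not the study's estimates).
coefs = {
    "ppvt": 0.10,
    "cond_x_ppvt": 0.01,
    "ppvt_x_fix": 0.001,
    "cond_x_ppvt_x_fix": 0.002,
}

# One slope per combination of condition and attention level (four in total).
slopes = {
    (cond_label, att_label): ppvt_simple_slope(coefs, cond, fix)
    for cond_label, cond in (("coview", 1), ("noninteractive", 0))
    for att_label, fix in (("high attention", 20.0), ("low attention", -20.0))
}
```

With these placeholder values the steepest slope falls in the coviewing, high-attention cell, which mirrors the direction of the pattern reported above but says nothing about its magnitude.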


Discussion 


The present study examined how coviewing educational media 
impacted visual attention and story comprehension in a sample of low- 
income preschoolers. We found that coviewing heightened visual at- 
tention to the program, but neither attention nor coviewing directly 
predicted comprehension. Further explorations revealed that compre- 
hension was stronger when coviewing versus viewing independently 
only when children had both strong enough vocabularies and high 
enough visual attention to take advantage of the coviewing experience. 

Fig. 2. Simple slopes of comprehension by PPVT scores for the coview and noninteractive conditions when children had very low and very high attention. 

Our results align with prior research that has found inconsistent 
results regarding how coviewing impacts comprehension. Past work 
found that coviewing often did not benefit comprehension when nat- 
uralistic forms of coviewing were used (e.g. Rasmussen et al., 2016; 
Skouteris & Kelly, 2006). We offered two competing predictions related 
to attention as to why these forms of coviewing failed to provide added 
benefit over viewing alone. One prediction was that the coviewer dis- 
tracts children from the video. Thus, visual attention to the video might 
be lower, leveling out any potential benefit of the added auditory input. 
Our alternative hypothesis was that, aligned with the interpretations of 
Salomon (1977) and Strouse et al. (2013), coviewing actually enhances 
visual attention to the program. Our results supported the latter 
hypothesis: attention was supported by coviewing. As such, lower 
attention did not seem like a valid reason for a lack of coviewing 
influences on comprehension. 

However, subsequent analyses revealed that higher attention did 
not necessarily result in higher comprehension. Prior research has 
suggested a tenuous association between visual attention and learning 
(Kirkorian et al., 2017), and our study likewise found that attention 
did not directly predict comprehension. Similarly, comprehension 
did not differ based on coviewing condition in our study. As such, even 
though visual attention was heightened as a result of coviewing, this 
did not necessarily translate into stronger comprehension. 

Ultimately, we found that coviewing predicted stronger compre- 
hension specifically when children had both higher attention to the 
screen and relatively high baseline vocabularies. Aligned with dual 
coding theory (Clark & Paivio, 1991; Paivio, 1990), these results sug- 
gest that having two sources of simultaneous auditory input (the media 
and the coviewer) may have been taxing on the auditory processing 
resources of children, particularly those with weaker vocabularies. 
When children had stronger baseline vocabularies, they may have been 
more likely to process and take advantage of both channels of auditory 
input. This alone was insufficient for coviewing to predict comprehension 
in our study, however. The visual channel also needed to have 
high input (reflected in higher visual attention to the video) in order for 
children with stronger vocabularies to demonstrate stronger compre- 
hension in the coviewing condition. 

As such, attention alone was not a missing link between coviewing 
and comprehension, but rather one of multiple factors that contributed 
to whether coviewing was more likely to predict comprehension. 
Children in our sample were from low-income backgrounds and had 
relatively weak vocabularies overall. Even children in our higher lan- 
guage group had mean vocabulary standard scores that were two-thirds 
of a standard deviation below the population mean. For these children 
with average to below-average language skills, neither language nor 
attention alone was enough to support comprehension; both were 
necessary for coviewing to positively predict comprehension. 

Situated within the coviewing literature, the present study suggests 
that some of the inconsistencies in prior work might relate to the role of 
attention, the style of coviewing, and child vocabulary. Prior work on 
naturalistic coviewing (e.g. Rasmussen et al., 2016; Skouteris & Kelly, 
2006), much like our study, showed no overall comprehension benefit 
to coviewing. The present study suggests that a possible reason this 
style of coviewing may not demonstrate comprehension benefits is that 
the efficacy of coviewing interacts with child and process variables. 
Only under certain circumstances does coviewing predict stronger 
comprehension. 

Our enactment of coviewing did not incorporate two elements that 
have been studied in past research: personalized, contingent interactions 
and questioning techniques. Prior studies on personalized, 
questioning-intensive coviewing interventions (Reiser et al., 1984; Strouse 
et al., 2013) have often shown benefits for comprehension. This in- 
struction-heavy approach that incorporates extensive questioning and 
feedback may consistently improve learning, but is unlikely to be used 
spontaneously by parents. The present study found that an enactment of 
coviewing that used non-questioning educational interactions is 
insufficient to promote comprehension by itself. 

In combination with the prior coviewing literature, our study sug- 
gests that children with varied background characteristics may be most 
likely to benefit from a more intensive coviewing approach than the one 
we studied — an enactment that is both personalized and questioning- 
focused, such as the dialogic questioning enactment studied by Strouse 
et al. (2013). Additionally, in order to allow for auditory processing 
time, pausing the video to have these discussions surrounding content is 
likely to benefit children. Unfortunately, these conditions are unlikely 
to reflect the spontaneous coviewing landscape. The enactment in our 
study, by contrast, was more likely to reflect interactions used 
spontaneously by parents when educationally coviewing, and it seemed 
potentially beneficial for only a subset of children. 

As such, our study suggests that general recommendations to ac- 
tively coview educational media with children may not always enhance 
comprehension without additional guidance on exactly how to coview. 
Even the clearly educational enactment of coviewing in our study did 
not produce an overall benefit. A less focused enactment that might 
occur in natural contexts where parents are unfamiliar with techniques 
to enhance learning may not produce the intended learning benefits to 
coviewing educational media. Rice et al. (1990) found that viewing 
educational media with an adult failed to predict learning, though 
viewing alone did. As such, the present research highlights the im- 
portance of providing parents and teachers guidance on the strategies to 
use while coviewing, as an unfocused enactment of coviewing may not 
generate the intended return-on-investment. 

Overall, the present study extended our understanding of coviewing 
by investigating how coviewing impacts attention, as well as how at- 
tention and baseline vocabulary interact with coviewing to predict 
comprehension. Nonetheless, our study had some limitations. Our 
findings may not be generalizable to all coviewing partners (e.g. par- 
ents and teachers), since our study utilized researchers unfamiliar to the 
child. However, using researchers as coviewers allowed for an in- 
vestigation of a clearly defined intervention with fewer distractions. 
Additionally, the strategies used in our enactment were quite common 
in typical parent-child shared book reading (e.g. Evans et al., 2011; 
Fisch et al., 2002; Ninio & Bruner, 1978). Finally, our results followed 
similar patterns to prior work studying similar enactments with parent- 
child dyads, suggesting that our study likely resembled a natural cov- 
iewing situation. 

A second limitation is the restricted nature of our sample as well as 
our sample size. We studied only 83 children from low-income back- 
grounds with relatively weak baseline vocabularies approximately one 
standard deviation below the norm. This may limit generalizability 
to populations with stronger language skills, and caution should be taken 
when generalizing findings from one study alone due to the sample size 
and variations between samples. For our sample, both visual attention 
and stronger PPVT scores interacted with coviewing to predict com- 
prehension. Fewer or different components of the interaction may be 
needed for children from more advantaged backgrounds or for children 
with language skills that are very strong. Nonetheless, some prior work 
on coviewing (e.g. Strouse et al., 2013) has focused on relatively high- 
income, educated samples and has found that comprehension was still 
not directly benefited by coviewing enactments similar to ours. As such, 
it is likely that processes such as individual differences in attention and 
language proficiency are relevant to varied samples, though they may 
not show associations identical to those in our study. 

Another limitation of our study was that we did not assess com- 
prehension of concepts prior to viewing the video. It is therefore pos- 
sible that some children had greater understanding of one video com- 
pared to another. We did randomly assign children to the specific video 
viewed in each condition. As such, possible variations in prior knowledge 
are likely to add random rather than systematic noise to the data. 

Finally, our coviewing enactment did not incorporate any ques- 
tioning techniques, and it was not personalized to match individual 
children's experiences. Parents may link programs to their child's 
personal lives more than our study enactment did. However, the primary 
limitation of parent-child research on coviewing is a lack of clarity on 
enactment — which is something the present study was able to provide. 
We were also able to investigate a clearly educationally focused inter- 
action that did not include questioning techniques. 

Parents are often advised to coview media with their children 
(e.g. American Academy of Pediatrics, 2016). The present study sug- 
gests that the benefits of an educational, but non-intensive enactment of 
coviewing may not be ubiquitous. Within an educational media cov- 
iewing context, we suggest that practices such as coviewing need to be 
considered in combination with the child's language and attention. 
Coviewing in a less intensive manner may predict comprehension for 
children with adequate language skills and attention. However, for 
children with less developed skills, our enactment of coviewing may 
have produced an overwhelming rather than supportive language en- 
vironment. As such, an educational but non-intensive form of coviewing 
may not be a high-yield practice to boost comprehension of educational 
media for children who are still developing the necessary language or 
attentional skills to take advantage of it. 